Search results for "Binary data"
showing 10 items of 13 documents
A comparison of HDFS compact data formats: Avro versus Parquet
2017
In this paper, file formats like Avro and Parquet are compared with text formats to evaluate the performance of the data queries. Different data query patterns have been evaluated. Cloudera’s open-source Apache Hadoop distribution CDH 5.4 has been chosen for the experiments presented in this article. The results show that the compact data formats (Avro and Parquet) take up less storage space than plain text formats because of their binary encoding and compression. Furthermore, data queries on the column-based format Parquet are faster than queries on text formats and Avro. Article in English. HDFS glaustųjų duomenų formatų palyginimas: Avro prieš Parquet…
Computation of the area in the discrete plane: Green’s theorem revisited
2017
The detection of the contour of a binary object is a common problem; however, the area of a region, and its moments, can be a significant parameter. In several metrology applications, the area of planar objects must be measured. The area is obtained by counting the pixels inside the contour or using a discrete version of Green's formula. Unfortunately, we obtain the area enclosed by the polygonal line passing through the centers of the pixels along the contour. We present a modified version of Green's theorem in the discrete plane, which allows for the computation of the exact area of a two-dimensional region in the class of polyominoes. Penalties are introduced and …
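The gap this abstract describes between pixel counting and a discrete Green's formula can be illustrated with a minimal sketch. It is not the authors' modified theorem: it only applies the standard shoelace formula to the pixel centers along the contour of a solid rectangular block. The function names and the 4 x 3 example are assumptions made for illustration, and the contour helper assumes a width and height of at least 2.

```python
def pixel_count_area(width, height):
    """Area of a solid width x height block, counting unit pixels."""
    return width * height


def rectangle_contour_centers(width, height):
    """Centers of the boundary pixels of a solid block, counter-clockwise.

    Pixel (i, j) is assumed to have its center at (i + 0.5, j + 0.5).
    Assumes width >= 2 and height >= 2.
    """
    bottom = [(i + 0.5, 0.5) for i in range(width)]
    right = [(width - 0.5, j + 0.5) for j in range(1, height)]
    top = [(i + 0.5, height - 0.5) for i in range(width - 2, -1, -1)]
    left = [(0.5, j + 0.5) for j in range(height - 2, 0, -1)]
    return bottom + right + top + left


def shoelace_area(vertices):
    """Polygon area from ordered vertices (discrete form of Green's formula)."""
    twice_area = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2


if __name__ == "__main__":
    w, h = 4, 3  # hypothetical block size
    print("pixel count:", pixel_count_area(w, h))
    print("shoelace over contour centers:",
          shoelace_area(rectangle_contour_centers(w, h)))
```

For this block the polygon through the contour centers is a (width - 1) x (height - 1) rectangle, so the shoelace result is smaller than the pixel count; this is the discrepancy that the abstract says a modified formula is meant to remove.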
Isobaric Vapor−Liquid Equilibria for Water + Acetic Acid + Lithium Acetate
2001
Isobaric vapor−liquid equilibria for all of the binary and ternary mixtures of water, acetic acid, and lithium acetate have been measured at 100.00 kPa using a recirculating still. To take into account the association of the acetic acid in the vapor phase, Marek's chemical theory has been considered. The three experimental binary data sets have been independently correlated using Mock's electrolyte NRTL model, and the binary parameters estimated for each binary system have been used to predict the ternary vapor−liquid equilibrium using the same model. No ternary parameters were required. The ternary equilibrium values obtained in this way agreed well with the experimental values.
The problem of interoperability: A common data format for quantum chemistry codes
2007
A common format for quantum chemistry (QC), enhancing code interoperability and communication between different programs, has been designed and implemented. An XML-based format, QC-ML, is presented for representing quantities such as geometry, basis set, and so on, while an HDF5-based format is presented for the storage of large binary data files. Some preliminary applications that use the format have been implemented and are also described. This activity was carried out within the COST in Chemistry D23 project “MetaChem,” in the Working Group “A meta-laboratory for code integration in ab initio methods.” © 2007 Wiley Periodicals, Inc. Int J Quantum Chem, 2007
III: „Relatives Risiko” und „NNT” - anschauliche Maße für binäre Daten
2002
Description of categorical data can often be based on contingency tables. However, percentages appearing in such tables must be meaningful: For most applications, it may be useful to employ factors of causal influence as the row entry variable and relate percentages to sub-groups defined by these row entries ("row percentages"). The comparison of success frequencies (i.e. binary information on "therapy success yes/no") may be based on two therapies, respective success frequencies and their ratio, the relative risk. In addition the success frequencies' difference, the absolute (or excess) risk, can be transformed into the "number needed to treat" (NNT). Many international journals demand thi…
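The three measures named in this abstract can be sketched for a 2 x 2 table as follows. The function name, the argument names, and the example counts are hypothetical; the NNT is taken here as the reciprocal of the absolute difference in success frequencies, which is the usual definition and is assumed rather than quoted from the article.

```python
import math


def binary_outcome_measures(success_a, n_a, success_b, n_b):
    """Relative risk, risk difference, and NNT for two therapies.

    success_a / n_a and success_b / n_b are the row percentages
    (success frequencies) of therapy A and therapy B.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Ratio of success frequencies; undefined if therapy B never succeeds.
    relative_risk = p_a / p_b if p_b > 0 else math.inf
    risk_difference = p_a - p_b
    # NNT is unbounded when the two therapies perform identically.
    nnt = 1 / abs(risk_difference) if risk_difference != 0 else math.inf
    return {
        "relative_risk": relative_risk,
        "risk_difference": risk_difference,
        "nnt": nnt,
    }


if __name__ == "__main__":
    # Hypothetical counts: 40 of 100 successes under A, 20 of 100 under B.
    print(binary_outcome_measures(40, 100, 20, 100))
```

In practice the NNT is rounded up to the next whole patient when it is reported.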
Accelerating data queries on Hadoop framework by using compact data formats
2016
Massive amounts of data are generated by IoT devices, online transactions, click streams, emails, logs, posts, social networking interactions, sensors, mobile phones and their applications. The question is where and how to store these data in order to provide faster data access. Understanding and handling Big Data is a big challenge. The research direction in Big Data projects using Hadoop technology, MapReduce-style frameworks and compact data formats such as RCFile, SequenceFile, ORC, Avro, Parquet shows that only two data formats (Avro and Parquet) support schema evolution and compression in order to utilize less storage space. In this paper, file formats like Avro and Parquet are c…
Fiber optic monitoring buses with binary on-off sensors
1999
This paper presents general theoretical considerations of complex-structure optical fiber networks (buses) with binary 'on-off' fiber optic sensors and fiber optic transmission lines for monitoring, diagnostic or measurement systems. The principles of fiber optic serial and parallel buses and various types of intensity fiber optic binary sensors are described as well as the advantages and disadvantages of the individual types of networks. The choice of the use of fiber optic technology rather than other techniques is discussed. Special emphasis was put on the role and function of optoelectronic and optical fiber devices in harsh environments. Theoretical considerations are illustrated by th…
Adaptive Techniques for Microarray Image Analysis with Related Quality Assessment
2007
We propose novel techniques for microarray image analysis. In particular, we describe an overall pipeline able to solve the most common problems of microarray image analysis. We propose the microarray image rotation algorithm (MIRA) and the statistical gridding pipeline (SGRIP) as two advanced modules devoted to restoring the original microarray grid orientation and to detecting the correct geometrical information about each spot of input microarray, respectively. Both solutions work by making use of statistical observations, obtaining adaptive and reliable information about each spot property. They improve the performance of the microarray image segmentation pipeline (MISP) we rec…
Progressive transmission of secured images with authentication using decompositions into monovariate functions
2014
We propose a progressive transmission approach of an image authenticated using an overlapping subimage that can be removed to restore the original image. Our approach is different from most visible watermarking approaches that allow one to later remove the watermark, because the mark is not directly introduced in the two-dimensional image space. Instead, it is rather applied to an equivalent monovariate representation of the image. Precisely, the approach is based on our progressive transmission approach that relies on a modified Kolmogorov spline network, and therefore inherits its advantages: resilience to packet losses during transmission and support of hetero…
A generalization of the Binomial distribution based on the dependence ratio
2015
We propose a generalization of the Binomial distribution, called DR-Binomial, which accommodates dependence among units through a model based on the dependence ratio (Ekholm et al., Biometrika, 82, 1995, 847). Properties of the DR-Binomial are discussed, and the constraints on its parameter space are studied in detail. Likelihood-based inference is presented, using both the joint and profile likelihoods; the usefulness of the DR-Binomial in applications is illustrated on a real dataset displaying negative unit-dependence, and hence under-dispersion compared with the Binomial. Although the DR-Binomial turns out to be a reparameterization of Altham's Additive-Binomial and Kupper-Haseman's Cor…